R-MAT: A Recursive Model for Graph Mining
نویسندگان
چکیده
How does a ‘normal’ computer (or social) network look like? How can we spot ‘abnormal’ sub-networks in the Internet, or web graph? The answer to such questions is vital for outlier detection (terrorist networks, or illegal money-laundering rings), forecasting, and simulations (“how will a computer virus spread?”). The heart of the problem is finding the properties of real graphs that seem to persist over multiple disciplines. We list such “laws” and, more importantly, we propose a simple, parsimonious model, the “recursive matrix” (R-MAT) model, which can quickly generate realistic graphs, capturing the essence of each graph in only a few parameters. Contrary to existing generators, our model can trivially generate weighted, directed and bipartite graphs; it subsumes the celebrated Erdős-Rényi model as a special case; it can match the power law behaviors, as well as the deviations from them (like the “winner does not take it all” model of Pennock et al. [20]). We present results on multiple, large real graphs, where we show that our parameter fitting algorithm (AutoMAT-fast) fits them very well.
منابع مشابه
Tools for graph mining
Large real-world graphs often show interesting properties, such as power-law degree distributions and very small diameters. Discovering such patterns and regularities has a wide range of potential applications. It could help us with detecting outliers or abnormal subnetworks (such as terrorist networks or illegal money-laundering rings), maximizing e ciency of disease controlling, marketing, fo...
متن کاملAutomatic Discovery of Technology Networks for Industrial-Scale R&D IT Projects via Data Mining
Industrial-Scale R&D IT Projects depend on many sub-technologies which need to be understood and have their risks analysed before the project can begin for their success. When planning such an industrial-scale project, the list of technologies and the associations of these technologies with each other is often complex and form a network. Discovery of this network of technologies is time consumi...
متن کاملPredicting Implantation Outcome of In Vitro Fertilization and Intracytoplasmic Sperm Injection Using Data Mining Techniques
Objective The main purpose of this article is to choose the best predictive model for IVF/ICSI classification and to calculate the probability of IVF/ICSI success for each couple using Artificial intelligence. Also, we aimed to find the most effective factors for prediction of ART success in infertile couples. MaterialsAndMethods In this cross-sectional study, the data of 486 patients are colle...
متن کاملOutlier detection in financial statements: a text mining method
This paper presents a text mining methodology to extract outlying knowledge from a collection of financial statements. The main idea is to extract relevant financial performance indicators and discover implicit textual description of the indicators. The extracted information was represented using a network language i.e. conceptual graph. Outlier mining was performed on the conceptual graph repr...
متن کاملVisualization of large networks with min-cut plots, A-plots and R-MAT
What does a ‘normal’ computer (or social) network look like? How can we spot ‘abnormal’ sub-networks in the Internet, or web graph? The answer to such questions is vital for outlier detection (terrorist networks, or illegal money-laundering rings), forecasting, and simulations (“how will a computer virus spread?”). The heart of the problem is finding the properties of real graphs that seem to p...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2004